Decorrelating Feature Spaces for Learning General-Purpose Audio Representations

Abstract

We introduce DECAR, a self-supervised pre-training approach for learning general-purpose audio representations. Our system is based on clustering: it utilizes an offline clustering step to provide target labels that act as pseudo-labels for solving a prediction task. We develop on top of recent advances in computer vision and design a lightweight, easy-to-use scheme. We pre-train DECAR embeddings on a balanced subset of the large-scale Audioset dataset and transfer those representations to 9 downstream classification tasks, including speech, music, animal sounds, and acoustic scenes. Furthermore, we conduct ablation studies identifying key choices and also make all our code and pre-trained models publicly available.
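To make the idea of an offline clustering step that supplies pseudo-labels concrete, here is a minimal sketch in plain Python. It is not the DECAR implementation: the abstract does not say which clustering algorithm is used, so k-means is an assumption, as are the farthest-point initialization (chosen only to keep the toy run deterministic), the function names, and the two-dimensional toy vectors standing in for audio embeddings.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(features, k):
    """Deterministic initialization (an assumption of this sketch):
    start from the first vector, then repeatedly add the vector that is
    farthest from all centroids chosen so far."""
    centroids = [list(features[0])]
    while len(centroids) < k:
        farthest = max(
            features,
            key=lambda x: min(squared_distance(x, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans_pseudo_labels(features, k, iters=10):
    """Run plain k-means offline and return one cluster id per vector.
    The cluster ids can then serve as targets for a prediction task."""
    centroids = farthest_point_init(features, k)
    labels = [0] * len(features)
    for _ in range(iters):
        # Assignment step: each vector takes the id of its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(x, centroids[j]))
            for x in features
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [x for x, lab in zip(features, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy "embeddings": two well-separated groups of points.
toy_features = [
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
    [5.0, 5.1], [5.2, 4.9], [4.9, 5.0],
]
pseudo_labels = kmeans_pseudo_labels(toy_features, k=2)
print(pseudo_labels)
```

In a pre-training loop of this kind, the returned cluster ids would be treated as classification targets for the encoder, and the clustering would typically be recomputed from fresh embeddings between training rounds; those details are likewise assumptions here, not statements from the abstract.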

Similar Resources

Feature Representations for Neuromorphic Audio Spike Streams

Event-driven neuromorphic spiking sensors such as the silicon retina and the silicon cochlea encode the external sensory stimuli as asynchronous streams of spikes across different channels or pixels. Combining state-of-the-art deep neural networks with the asynchronous outputs of these sensors has produced encouraging results on some datasets but remains challenging. While the lack of effective spi...

A General-Purpose Implementation of Conceptual Spaces

The highly influential framework of conceptual spaces provides a geometric way of representing knowledge. Instances are represented by points and concepts are represented by regions in a high-dimensional space. Based on our recent formalization, we present a general-purpose implementation of the conceptual spaces framework that is not only capable of representing concepts with inter-domain correl...

Toward General-Purpose Learning for Information Extraction

Two trends are evident in the recent evolution of the field of information extraction: a preference for simple, often corpus-driven techniques over linguistically sophisticated ones; and a broadening of the central problem definition to include many non-traditional text domains. This development calls for information extraction systems which are as retargetable and general as possible. Here, w...

Costs of General Purpose Learning

Leo Harrington surprisingly constructed a machine which can learn any computable function f according to the following criterion (called Bc-identification). His machine, on the successive graph points of f, outputs a corresponding infinite sequence of programs p_0, p_1, p_2, ..., and, for some i, the programs p_i, p_{i+1}, p_{i+2}, ... each compute a variant of f which differs from f at only finitel...

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

A lot of the recent success in natural language processing (NLP) has been driven by distributed vector representations of words trained on large amounts of text in an unsupervised manner. These representations are typically used as general purpose features for words across a range of NLP problems. However, extending this success to learning representations of sequences of words, such as sentenc...

Journal

Journal title: IEEE Journal of Selected Topics in Signal Processing

Year: 2022

ISSN: 1941-0484, 1932-4553

DOI: https://doi.org/10.1109/jstsp.2022.3202093